Supplementary Material A: Hierarchical Bayesian Models to describe differences in isotopic values between and within species.

Contents

Imports

Miscelaneous configurations

Exploratory Data Analyses:

Data description

In the following plot we can observe that:

  1. There is an isotopic gradient between the three species in both isotopes: R. steindachneri < H. dipterus < N. entemedor.

  2. This gradient is consistent, to different extents, through the covariates, causing deviations from the statistic normality, especially in multivariate terms.

  3. Some classes are misrepresented in relation to anothers

It is important to mention that the year variable has two levels only for R. steindachneri and will, hence, be discarded from the analysis.

Justification of methods

Frequentist null hypothesis testing has been useful in ecological studies; however, it has been suggested that inferences should, instead, be made from models, likelihodd ratios or in probabilistic terms (i.e. Bayesian Inference, Gerrodette 2011; Ellison 2004; Hobbs & Hilborn 2006; Armhein et al. 2019); hence, analyses were based on Bayesian Inference. In general, Bayesian Inference consists on the reallocation of credibility among a space of candidate possibilities, using the Bayes' Theorem to evaluate the credibility of a parameter given the data and prior knowledge of the parameter (Bolstad 2004; Kruschke 2015). Details on the implementation of the models and the sampling of the posterior distributions are given in each section; however, each model was run until convergence; i.e., 0 divergences during the posterior sampling and Gelman-Rubick statistics equal to 1.0 for every parameter. Other graphical diagnostics such as Posterior Predictive Checks and energy plots (Betancourt 2017, Gabry et al. 2018) for Hamiltonian Monte Carlo-based samplers are included in the supplementary material. Every sample after convergence (posterior) was kept. This approach was taken since thinning the posterior sample is unadvised, unless there are computational restrictions, due to a reduction in the precision of the estimates (see Link and Eaton 2011 and references therein, Stan Development Team 2019). Still, the number of posterior samples for each model was dependent on the number of independent samples (effective sample sizes, ess) being over 2000, both for bulk and tail ess (affecting the reliability of the median and highest density intervals, respectively).

Comparisons of means and effects of predictors

The isotopic spaces of the three species were described using a custom hierarchichal bivariate model, in which the effects of the climatic seasons (warm vs. cold), sexes and age categories (juveniles vs. adults) on the isotopic values are nested within each species, meaning that the isotopic space of each species is the result of two linear models (one per isotopic ratio) of the covariates. This model was implemented using the pymc3 library (v. 3.11.2, Salvatier et al. 2016) in Python 3 (v. 3.8.2, Van Rossum and Drake 2009; code available in GITHUBREPO), with three chains that were tuned for 25,000 iterations and a posterior sample of 2,000 iterations (draws). The model was specified as follows, where $j$ represents the isotopic ratio, $i$ the species, $a$ the intercept (i.e. the mean isotopic value) and $b$ the slopes of the regression (i.e. the difference between both levels of the covariates):

This parametrization obeyed the following reasons: i) a bivariate model allows accounting for the covariation between bulk isotopic values, which is relevant since these depend both on the isotopic baseline and the trophic level; hence, having a joint distribution that is not orthogonal; and, ii) the distributions used have heavier tails than a normal distribution, which assigns a higher probability to extreme values and, thus, allows to make robust estimations of the parameters (Kruschke 2012).

Graph of the hierarchical model.

Diagnostics

Supplemmentary material Table X. Summary statistics of the posterior distributions of the parameters. Gelman-Statistic values equal to 1.0 and Effective Sample Sizes (both Bulk and Tail) over 2000 for every parameter.

Posterior Predictive Checks for the likelihood model. The observed distribution (black) is between the posterior predictive samples (blue). Each posterior predictive sample consists on a set with the same number of observations as the original data, generated based on a parameter from the posterior samples.

Energy plot and Bayesian Fraction of Missing Information (BFMI). The energy plot shows similar densities between the marginal and transition energies from the Hamiltonian simulation, meaning that the sampler was not stuck in a particular area of the posterior distribution, which is validated through the BFMI values (Betancourt 2017).

Effective sample sizes, autocorrelations and no-thinning.

The autocorrelation (i.e., the correlation between the $sample_i$ and the $sample_{i-1}$) directly affects the effective sample size (ess) estimation (higher autocorrelations, lower ess). For this reason, a very common practice is to thin the posterior sample (i.e. keep only 1 out of every k-th iteration after convergence); however, this is unadvised since it leads to a reduction in the precision of the estimates (Link y Eaton 2011 and references therein), making it feasible only to reduce computational requirements (STAN reference manual). Moreover, one of the advantages of using NUTS is that the samples can be uncorrelated, since the new state of the chain depends more on the probability "surface" rather than the previous position of the particle. Shown in the following autocorrelation plot:

Parallel coordinates plots for each parameter of interest within the model. Shows that every sample of the posterior distribution was non-divergent. A) Mean values for each species and B) Slopes for each covariate for each species; where: 0: H. dipterus, 1: N. entemedor, 2: R. steindachneri

A note on the interpretation of results

It is important to understand the interpretation of the slope of a categorical covariate before going to the description of the results. Consider a univariate model with a binary covariate ($X$) such as the ones of this model:

$Y = \beta_0 + \beta_1*X + \epsilon$

$\therefore$

If $X = 0$: $Y = \beta_0 + \beta_1*0 + \epsilon \implies Y = \beta_0 + \epsilon$

If $X = 1$: $Y = \beta_0 + \beta_1*1 + \epsilon \implies Y = \beta_0 + \beta_1 + \epsilon$

I.e., the slope represents the difference between the means of both classes. This is demonstrable if we substract both equations ($\Delta_Y$):

$Y = \beta_0 + \epsilon = \beta_0 + \beta_1 + \epsilon \implies \Delta_Y = \beta_0 - \beta_0 - \beta_1 + \epsilon - \epsilon \therefore \Delta_Y = -\beta_1$

Taking this into account, the slopes in this work represent the difference (in ‰) between the two classes being analysed (e.g. females vs. males).

Results

The blue trace corresponds to H. dipterus, the orange to N. entemedor and the green to R. steindachneri.

  1. $\mu$: Posterior distribution of the means of each species for each isotope. They are among the higher levels on the hierarchy, hence including the uncertainty in the other estimates:

    • $\mu_{\delta^{13}C}$: Shows the separation between species in $\delta^{13}C$. H. dipterurus shows intermediate values among the other two species.

    • $\mu_{\delta^{15}N}$: H. dipterurus and R. steindachneri have similar values, lower than those of N. entemedor-

  1. Sex_: Slope for the sexes of each species for each isotope.
    • Codes: Male = 1, Female = 0.
    • The difference between sexes is small ($\approx 0$‰).
  1. Temp_: Slope for the season:
    • Codes: Cold = 1, Warm = 0.
    • H. dipterus: Difference between seasons of $\approx 1$‰ in $\delta^{13}C$ and $\approx 2$‰ in $\delta^{15}N$.
    • N. entemedor: $\Delta^{15}N \in (-1,0)$‰
    • The posterior distributions of R. steindachneri are very broad, possibly due to a sample size of three for the cold season.
  1. Age_: Slope value for the age category of each isotope for each species:
    • Codes: Juveniles = 1, Adults = 0.
    • H. dipterus $\Delta^{13}C \in (0,2)$, $\Delta^{15}N \in (-1.5,0]$
    • N. entemedor $\Delta^{15}N \in [-1.5,0)$
    • R. steindachneri: $\Delta^{13}C \in (-1,0)$

Reference: Labels are assigned alphabetically:

Inference

Forest plot of the effects of the covariates on the $\delta^{15}N$ values. Horizontal lines represent the 95% Highest Density Intervals (HDI). Vertical reference line in 0.

Forest plot of the effects of the covariates on the $\delta^{13}C$ values. Horizontal lines represent the 95% Highest Density Intervals (HDI). Vertical reference line in 0.

We can also directly evaluate the $P(\beta \neq 0)$, based on that if $P(\beta < 0)$ or $P(\beta > 0) \approx 50\%$ the $P(\beta \neq 0)$ is very low (50% of the distribution on either side of 0).

Sexes

In general, the effect is small, with probabilities smaller than 75% in most cases. The only exception was in $\delta^{13}C$ for H. dipterus with $P(\beta < 0) \approx 90\%$, suggesting that females possibly had more coastal habits than males ($\bar{\beta} = -0.32$‰).

Season

The effect of the season was more evident, with $P(\beta < 0 \lor \beta > 90)\%$ in most comparisons:

The lowest $P(\beta < 0 \lor \beta > 90)\%$ were found for N. entemedor ($P \approx 70\%$) and R. steindachneri ($P \approx 60%$) in $\delta^{13}C$.

Age

There were high probabilities of differences between age categories, with the exception of R. steindachneri en $\delta^{15}N$ ($P(\beta>0) \approx 56\%$). The rest of the comparisons had similar trends, in the sense that juveniles had lower values in both isotopes, suggesting more oceanic habitats and lower trophic positions. The only exception was H. dipterus in $\delta^{13}C$, whose juveniles had more coastal habits (less negative values).

Isotopic mean comparison between species

Since in the process we also estimated the posterior distributions of the means for each species, we can also directly compare them by simply substracting them:

In the following figures we can observe that probabilities of mean differences were high in every comparison. Summarising:

Joint Posterior Distribution

Given that both isotopes were modeled at the same time, we can plot the joint posterior mean distribution, which is non-orhtogonal as if both isotopes were independent from one another (separate Generalized Linear Models or independent null hypothesis testing). It is important to remark that the means are not in the centroid of each group. This is a favorable consequence of having used Student-T distributions for the parameters and likelihood instead of normal distributions. By stablishing that extreme values have a higher probability than in a normal distribution, the effect of the more extreme values is disregarded or, in another words, in a normal distribution they have low probabilities and, hence, each point is given a heavier weight.

This consequence is why using heavy-tailed distributions is called robust regression. In this example the effect is shown in a linear regression with a continuous covariate, in which the robust estimation of the parameters is not "deceived" by the more extreme values on the y variable and posterior predictive lines are closer to the real line than those of the "normal" regression.